The following remarks formally respond to the Office action, "crossing the t's and
dotting the i's."

Claims 1-5, 7, 10, and 13-22 were rejected under 35 USC 102 as being anticipated 
by Lee et al, US Patent 6,088,673. Applicants respectfully traverse. 

In the previous Office Action response, applicants discussed this reference in detail.
Although applicants have given reasons for holding that the reference neither anticipates
claim 1 nor renders it obvious, with and without the amendments introduced in the last
Office Action, the Examiner has made no substantive comment about applicants'
argument. Rather, the Examiner has simply stated that applicants' arguments are moot, and
asserted that the new grounds for rejection were "necessitated" by applicants' amendment.

Applicants respectfully traverse.

Preliminaries

First, this is the same rejection that was lodged in the first Office Action (at least in 
connection with a number of the claims). In this Office action the Examiner failed to 
explain where the Examiner departs from applicants' view of the teachings found in the 
reference (since these views were presented) and why the Examiner believes that
applicants' view is incorrect. 

Second, although applicants have made some amendments to the claims, it is 
believed that applicants' amendments did not necessitate the presentation of the instant 
rejection. 

Substantive Argument 

In connection with claims 1, 18-20, 21, and 22, the Examiner states, inter alia, that
"[F]urthermore, Lee et al teaches at least two prosody parameter specification toward a
target value (col 6, lines 10-19)." (Last sentence in second paragraph of point 2 in the 
Examiner's Detailed Action) 

Perusing the reference, one finds that the cited passage states: 



Text information includes a classification code for a used language and a 
sentence text. Prosody information includes the number of phoneme in 
the sentence, phoneme stream information, the duration of every each 
phoneme, pitch pattern of phoneme, energy pattern of phoneme and is 
used for enhancing the natural of the synthesized speech. The 
synchronization information of the moving picture with the synthesized 
speech can be considered as the dubbing concept and the synchronization
could be realized in three ways. 

Applicants accept that this passage teaches various parameters of a synthesized sentence,
including

1. the number of phonemes,
2. phoneme stream information,
3. duration of every each phoneme,
4. pitch pattern of each phoneme,
5. energy pattern of each phoneme, and
6. synchronization information of the moving picture with the synthesized
speech.

The above suggests that the first two clauses of applicants' claim 1 are, indeed, taught by
Lee et al, by force of points 1, 2 and 3, above.

However, applicants' claim 1 also has a third clause, which states: 

including, for at least one of said phonemes, at least two prosody 
parameter specifications, with each specification of a prosody parameter 
specifying a target value for said prosody parameter and a point in time 
for reaching said target value. 

The question, then, is: from points 4, 5, and 6, above, which two prosody parameter
specifications does the Examiner refer to in his above-quoted remark?

While the Examiner has not revealed what he has in mind, applicants surmise that 
the Examiner was referring to points 4 and 5, since pitch information and energy 
information are clearly prosody parameters, and synchronization information is not. 

Since the third clause of claim 1 specifies that each of the parameters specifies "a 
target value for said prosody parameter" and "a point in time for reaching said target value," 
it follows that there is NO ANTICIPATION UNLESS Lee et al specify both "a target value
for said prosody parameter" and "a point in time for reaching said target value."

It is noted that the cited passage teaches nothing about how the pitch pattern of each 
phoneme is specified. Similarly, the cited passage teaches nothing about how the energy 
pattern of each phoneme is specified. However, the passage at col. 5, lines 54-63 does teach 
how the pitch and the energy patterns are specified: that is, three pitch values are specified 
for each phoneme, and three energy values are specified for each phoneme. Applicants' key 
argument is that no "point in time" for reaching these values is specified in the reference.
Admittedly, there are points in time for these values, but these are known a priori and,
therefore, need not be specified (and are not specified).

Applicants wish to observe that the Lee et al approach is significantly inferior to
applicants' approach, which is not limited to three specific target values (there could be
more or fewer of them) and is not limited to specific instances within a phoneme. A target
value can be specified for any "point in time."

In light of the above, applicants respectfully submit that the rejection of claim 1 in 
view of Lee et al is inappropriate and should be rescinded. Applicants further submit that 
since claim 1 is not anticipated by Lee et al, it follows that all claims that depend on claim 1 
- that is, claims 2-20 - are also not anticipated by Lee et al. 

As for independent claim 21, it is amended herein in the same spirit that claim 1 is 
amended, and applicants respectfully adopt the analysis vis-à-vis the Lee et al reference
employed above in connection with claim 1, because claim 21 specifies a "third step" for
including a target parameter value at an explicitly chosen time offset. 

In light of the above amendments and remarks, applicants respectfully submit that all
of the outstanding claims are in condition for allowance. Therefore, reconsideration and 
allowance of same are respectfully solicited. 



Respectfully, 
Mark Beutnagel 
Joern Ostermann 
Schuyler Quackenbush

Appendix - Marked-up version showing changes made



IN THE CLAIMS: 

1. (Twice Amended) A method for generating a signal rich in prosody information
comprising the steps of:

including in said signal a plurality of phonemes represented by phoneme symbols,

including in said signal a duration specification associated with each of said
phonemes,

including, for at least one of said phonemes, at least two prosody parameter
specifications, with each specification of a prosody parameter specifying a target value for
said prosody parameter and [a] any selected point in time for reaching said target value.

21. (Amended) A method for generating a signal rich in prosody information 
comprising: 

a first step for including in said signal a plurality of phoneme symbols,

a second step for including in said signal a desired duration of each of said phoneme
symbols,

a third step for including at least one target prosody parameter value within a
duration for at least one of said phonemes at [a] an explicitly chosen time offset from the
beginning of the duration of said phoneme that is greater than zero and less than the duration
of said phoneme.